Dataset Selection

a) clustering: select representative samples and remove outliers. Samples can be clustered on per-sample signals such as loss, gradient, etc.
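
A minimal sketch of clustering-based selection, under stated assumptions: each sample is described by a hypothetical feature tuple (per-sample loss, gradient norm), clustering is plain k-means, the representative of a cluster is the sample nearest its centroid, and any cluster smaller than `min_cluster_size` is treated as outliers. The function names, the feature choice, and the small-cluster outlier rule are illustrative assumptions, not something the notes prescribe.

```python
import math

def kmeans(points, k, iters=20):
    """Lloyd's k-means with deterministic farthest-point initialisation."""
    centroids = [points[0]]
    while len(centroids) < k:
        # Next centroid: the point farthest from every centroid chosen so far.
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        )

    def assign():
        # Index of the nearest centroid for every point.
        return [min(range(k), key=lambda j: math.dist(p, centroids[j]))
                for p in points]

    for _ in range(iters):
        labels = assign()
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = tuple(sum(col) / len(members)
                                     for col in zip(*members))
    return centroids, assign()

def select_by_clustering(features, k, min_cluster_size=2):
    """Return (representative indices, outlier indices).

    Assumption: clusters smaller than min_cluster_size are outliers.
    """
    centroids, labels = kmeans(features, k)
    representatives, outliers = [], []
    for j in range(k):
        members = [i for i, lab in enumerate(labels) if lab == j]
        if len(members) < min_cluster_size:
            outliers.extend(members)
        else:
            # Representative: the member closest to the cluster centroid.
            representatives.append(
                min(members, key=lambda i: math.dist(features[i], centroids[j]))
            )
    return sorted(representatives), sorted(outliers)

# Hypothetical (loss, gradient norm) pairs; the last sample is far from the rest.
features = [
    (0.10, 0.20), (0.12, 0.22), (0.09, 0.18), (0.11, 0.21),
    (1.00, 1.10), (1.05, 1.00), (0.95, 1.05), (1.00, 1.00),
    (5.00, 6.00),
]
representatives, outliers = select_by_clustering(features, k=3)
print(representatives, outliers)  # [0, 7] [8]
```

In practice a library clustering routine and a distance-based outlier rule may be preferable; the point here is only the select-representatives / drop-outliers split.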

b) data contribution: measure the contribution of each sample

  • The difference in model performance when training with versus without this sample
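
The with/without comparison can be sketched as a leave-one-out loop. This is only an illustration: the 1-nearest-neighbour classifier on 1-D features stands in for whatever model is actually trained, validation accuracy stands in for "performance", and the function names and toy data are hypothetical.

```python
def one_nn_accuracy(train, val):
    """Validation accuracy of a 1-nearest-neighbour classifier on 1-D features."""
    correct = 0
    for x, y in val:
        # Label of the training sample whose feature is closest to x.
        _, pred = min(train, key=lambda s: abs(s[0] - x))
        correct += (pred == y)
    return correct / len(val)

def leave_one_out_contributions(train, val, score=one_nn_accuracy):
    """Contribution of sample i = score(all data) - score(all data except i)."""
    full = score(train, val)
    return [full - score(train[:i] + train[i + 1:], val)
            for i in range(len(train))]

# (feature, label) pairs; the last training sample is deliberately mislabeled.
train = [(0, 0), (4, 0), (20, 1), (24, 1), (3, 1)]
val = [(2, 0), (5, 0), (21, 1), (23, 1)]

contributions = leave_one_out_contributions(train, val)
print(contributions)  # [0.0, 0.25, 0.0, 0.0, -0.25]
```

A negative contribution (the mislabeled sample here) means performance improves when the sample is removed. Leave-one-out needs one retraining per sample, so it is only practical as written for small datasets or cheap models.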

c) learn the weights of training samples: train with a weighted loss and evaluate on the validation set
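
One way to make this concrete, as a hedged sketch rather than a specific published method: fit a model by minimising a weighted training loss, measure the loss on the validation set, and adjust the sample weights by gradient descent on that validation loss. The example assumes a one-parameter model y = theta * x with a closed-form weighted least-squares fit, weights clipped to [0, 1], and toy data in which the last training point is corrupted; all names and hyperparameters are illustrative.

```python
def fit_slope(xs, ys, weights):
    """Weighted least squares for y = theta * x; returns (theta, sum of w*x^2)."""
    s_xx = max(sum(w * x * x for w, x in zip(weights, xs)), 1e-12)
    s_xy = sum(w * x * y for w, x, y in zip(weights, xs, ys))
    return s_xy / s_xx, s_xx

def learn_sample_weights(train_x, train_y, val_x, val_y, lr=0.05, steps=50):
    """Gradient descent on validation MSE with respect to the sample weights."""
    weights = [1.0] * len(train_x)
    for _ in range(steps):
        theta, s_xx = fit_slope(train_x, train_y, weights)
        # d(validation MSE) / d(theta)
        dloss_dtheta = sum(2 * (theta * x - y) * x
                           for x, y in zip(val_x, val_y)) / len(val_x)
        for i, (x, y) in enumerate(zip(train_x, train_y)):
            # d(theta) / d(weight_i), from differentiating the closed-form fit.
            dtheta_dw = x * (y - theta * x) / s_xx
            step = lr * dloss_dtheta * dtheta_dw
            weights[i] = min(1.0, max(0.0, weights[i] - step))
    theta, _ = fit_slope(train_x, train_y, weights)
    return weights, theta

# Clean relation is y = 2x; the last training target is corrupted.
train_x, train_y = [1, 2, 3, 4], [2, 4, 6, 20]
val_x, val_y = [1, 2, 3], [2, 4, 6]

learned_weights, learned_slope = learn_sample_weights(train_x, train_y, val_x, val_y)
print(learned_weights, learned_slope)
```

On this toy data the weight of the corrupted sample is driven toward zero while the clean samples keep weights near one. For models without a closed-form fit, the same idea requires differentiating through (or approximating) the training procedure.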